ci: add GitHub Actions workflow to run tests on PRs#99
ci: add GitHub Actions workflow to run tests on PRs#99mrshu wants to merge 1 commit intoevaleval:mainfrom
Conversation
7a61a37 to
9cd77ab
Compare
Previously no CI workflow ran pytest on pull requests, so regressions could land undetected. This adds a test.yml workflow triggered on PRs and pushes to main. - Split into three parallel jobs: core, inspect, helm - Core job runs tests that need no optional dependencies - Inspect and HELM jobs install their respective extras - Pin action versions to match regenerate_types.yml - Add timeout-minutes: 10 to guard against hung jobs - Update uv.lock Closes evaleval#88
9cd77ab to
7f66665
Compare
|
@nelaturuharsha not sure if this is what you had in mind -- happy to update it in any way you'd prefer! |
|
Would it make sense to install all requirements and then just run all tests? I worry that having lines like: will not be updated if more files with relevant inspect tests get added. It would feel safer to me if the test command was just |
Erotemic
left a comment
There was a problem hiding this comment.
I think the current PR could be accepted as is just to get something running, but I think we should avoid the globs if possible, and I think that adding locked/loose and core/full runs gives the best bang for the buck in terms of running different flavors of CI.
| name: Tests | ||
|
|
||
| on: | ||
| pull_request: | ||
| push: | ||
| branches: [main] | ||
|
|
||
| jobs: | ||
| core: | ||
| name: Core tests | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v6.0.2 | ||
| - uses: astral-sh/setup-uv@v7.6.0 | ||
|
|
||
| - name: Install core dependencies | ||
| run: uv sync --locked | ||
|
|
||
| - name: Run core tests | ||
| run: | | ||
| uv run pytest \ | ||
| tests/test_validate.py \ | ||
| tests/test_check_duplicate_entries.py \ | ||
| tests/test_inspect_uuid_utils.py \ | ||
| tests/test_cli_inspect_uuid.py \ | ||
| tests/test_lm_eval_adapter.py \ | ||
| -v | ||
|
|
||
| inspect: | ||
| name: Inspect converter tests | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v6.0.2 | ||
| - uses: astral-sh/setup-uv@v7.6.0 | ||
|
|
||
| - name: Install dependencies with inspect extra | ||
| run: uv sync --locked --extra inspect | ||
|
|
||
| - name: Run inspect tests | ||
| run: | | ||
| uv run pytest \ | ||
| tests/test_inspect_adapter.py \ | ||
| tests/test_inspect_instance_level_adapter.py \ | ||
| -v | ||
|
|
||
| helm: | ||
| name: HELM converter tests | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v6.0.2 | ||
| - uses: astral-sh/setup-uv@v7.6.0 | ||
|
|
||
| - name: Install dependencies with helm extra | ||
| run: uv sync --locked --extra helm | ||
|
|
||
| - name: Run HELM tests | ||
| run: | | ||
| uv run pytest \ | ||
| tests/test_helm_adapter.py \ | ||
| tests/test_helm_instance_level_adapter.py \ | ||
| -v |
There was a problem hiding this comment.
| name: Tests | |
| on: | |
| pull_request: | |
| push: | |
| branches: [main] | |
| jobs: | |
| core: | |
| name: Core tests | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install core dependencies | |
| run: uv sync --locked | |
| - name: Run core tests | |
| run: | | |
| uv run pytest \ | |
| tests/test_validate.py \ | |
| tests/test_check_duplicate_entries.py \ | |
| tests/test_inspect_uuid_utils.py \ | |
| tests/test_cli_inspect_uuid.py \ | |
| tests/test_lm_eval_adapter.py \ | |
| -v | |
| inspect: | |
| name: Inspect converter tests | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install dependencies with inspect extra | |
| run: uv sync --locked --extra inspect | |
| - name: Run inspect tests | |
| run: | | |
| uv run pytest \ | |
| tests/test_inspect_adapter.py \ | |
| tests/test_inspect_instance_level_adapter.py \ | |
| -v | |
| helm: | |
| name: HELM converter tests | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install dependencies with helm extra | |
| run: uv sync --locked --extra helm | |
| - name: Run HELM tests | |
| run: | | |
| uv run pytest \ | |
| tests/test_helm_adapter.py \ | |
| tests/test_helm_instance_level_adapter.py \ | |
| -v | |
| name: Tests | |
| on: | |
| pull_request: | |
| push: | |
| branches: [main] | |
| permissions: | |
| contents: read | |
| jobs: | |
| tests: | |
| name: ${{ matrix.deps }} / ${{ matrix.resolution }} | |
| runs-on: ubuntu-latest | |
| timeout-minutes: 10 | |
| strategy: | |
| fail-fast: false | |
| matrix: | |
| include: | |
| - deps: core | |
| resolution: locked | |
| sync: uv sync --locked | |
| pytest_args: >- | |
| tests | |
| --ignore-glob=tests/test_inspect*.py | |
| --ignore-glob=tests/test_helm*.py | |
| -v | |
| - deps: core | |
| resolution: loose | |
| sync: uv sync --upgrade | |
| pytest_args: >- | |
| tests | |
| --ignore-glob=tests/test_inspect*.py | |
| --ignore-glob=tests/test_helm*.py | |
| -v | |
| - deps: full | |
| resolution: locked | |
| sync: uv sync --locked --all-extras | |
| pytest_args: tests -v | |
| - deps: full | |
| resolution: loose | |
| sync: uv sync --upgrade --all-extras | |
| pytest_args: tests -v | |
| steps: | |
| - uses: actions/checkout@v6.0.2 | |
| - uses: astral-sh/setup-uv@v7.6.0 | |
| - name: Install dependencies | |
| run: ${{ matrix.sync }} | |
| - name: Run tests | |
| run: uv run pytest ${{ matrix.pytest_args }} |
There was a problem hiding this comment.
This is the style of CI that I like to use. I also like to use a "core" or "minimal" test of tests that just target the plain library without any extras. When dealing with extras instead of worrying about diffrent combinations I like to use a "full" version where all the extras are installed. We could split out the inspect / helm versions but I think it's not worth the effort.
What I do think is worth the effort is a locked / loose install check. I find it extremely valuable to test the locked version of the deps as well as the deps that would be installed if a user pip installed today. This gives a strong signal when an upstream package causes a break in this package.
For now I think it's fine to use the --ignore-glob, but really we should just use pytest to mark which tests should / shouldn't run based on the installed packages, but the glob method is good enough for now.
Previously no CI workflow ran pytest on pull requests, so regressions could land undetected. This adds a test.yml workflow triggered on PRs and pushes to main.
Closes #88